AITopics | video recognition

GenRec: Unifying Video Generation and Recognition with Diffusion Models

Neural Information Processing Systems | Feb-18-2026, 01:00:13 GMT

In particular, GenRec achieves competitive recognition performance, offering 75.8% and 87.2% accuracy on SSV2 and K400, respectively.

artificial intelligence, arxivpreprintarxiv, machine learning, (18 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)

MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer

Neural Information Processing Systems | Feb-15-2026, 11:29:24 GMT

Code is available at https://github.com/ZMHH-H/MoTE.

category, knowledge management, machine learning, (15 more...)

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

Zuxuan Wu, Caiming Xiong, Yu-Gang Jiang, Larry S. Davis

Neural Information Processing Systems | Feb-13-2026, 21:21:54 GMT

Neural Information Processing Systems http://nips.cc/

computation, fine feature, prediction, (17 more...)

Country:

North America > United States > Maryland (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

bd853b475d59821e100d3d24303d7747-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-13-2026, 21:21:40 GMT

artificial intelligence, computation, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

f723b0024f2b843572420b42312a9ed4-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 21:57:54 GMT

low-resolution frame, recognition, video recognition, (11 more...)

Country: Asia > China > Hong Kong (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

c6e954799a0218f6d341ad5cbfb58999-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 20:49:15 GMT

In video recognition, we need to sample multiple frames to represent each video, which makes the computational cost scale proportionally to the number of sampled frames. In most cases, a small proportion of all the frames is sampled for each input, which contains only limited information about the original video.

afnet, artificial intelligence, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)

70efdf2ec9b086079795c442636b55fb-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 08:06:32 GMT

covariance, recognition, representation, (16 more...)

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Look More but Care Less in Video Recognition

Neural Information Processing Systems | Dec-25-2025, 06:10:16 GMT

Existing action recognition methods typically sample only a few frames to represent each video in order to avoid enormous computation, which often limits recognition performance. To tackle this problem, we propose the Ample and Focal Network (AFNet), which is composed of two branches that use more frames with less computation. Specifically, the Ample Branch takes all input frames to obtain abundant information with condensed computation and provides guidance for the Focal Branch through the proposed Navigation Module; the Focal Branch squeezes the temporal size to focus only on the salient frames at each convolution block; in the end, the results of the two branches are adaptively fused to prevent loss of information. With this design, we can feed more frames to the network at a lower computational cost. Besides, we demonstrate that AFNet can use fewer frames while achieving higher accuracy, as the dynamic selection in intermediate features enforces implicit temporal modeling. Further, we show that our method can be extended to reduce spatial redundancy at even lower cost. Extensive experiments on five datasets demonstrate the effectiveness and efficiency of our method.
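The two-branch idea in this abstract can be illustrated with a small NumPy sketch. This is not the authors' AFNet implementation: the function names `select_salient_frames` and `fuse`, the hand-written saliency scores, the hard top-k selection, and the fixed fusion weight are all assumptions made for illustration. In the paper the selection comes from a learned Navigation Module and the fusion is adaptive.

```python
import numpy as np

def select_salient_frames(frame_features, scores, keep):
    """Keep the `keep` highest-scoring frames, preserving temporal order.

    Hypothetical stand-in for a learned frame-selection module.
    frame_features: (T, C) array, scores: (T,) array.
    """
    top = np.argsort(scores)[-keep:]   # indices of the top-`keep` scores
    idx = np.sort(top)                 # restore temporal order
    return frame_features[idx], idx

def fuse(ample_out, focal_out, weight):
    """Convex combination of the two branch outputs (fixed weight is an assumption)."""
    return weight * ample_out + (1.0 - weight) * focal_out

# Toy clip: 6 frames, 4 channels per frame.
features = np.arange(6 * 4, dtype=float).reshape(6, 4)
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7])  # made-up saliency scores

selected, idx = select_salient_frames(features, scores, keep=3)
ample_summary = features.mean(axis=0)   # cheap summary over all frames
focal_summary = selected.mean(axis=0)   # summary over the salient frames only
fused = fuse(ample_summary, focal_summary, weight=0.5)
```

The point of the sketch is only the data flow: one path sees every frame cheaply, the other sees a subset chosen by a score, and the two results are combined.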

computation, name change, video recognition, (3 more...)

Technology: Information Technology > Artificial Intelligence (0.40)

Temporal-attentive Covariance Pooling Networks for Video Recognition

Neural Information Processing Systems | Dec-24-2025, 07:09:04 GMT

For the video recognition task, a global representation summarizing the whole content of the video snippets plays an important role in the final performance. However, existing video architectures usually generate it with simple global average pooling (GAP), which has limited ability to capture the complex dynamics of videos. For the image recognition task, there is evidence that covariance pooling has stronger representation ability than GAP. Unfortunately, the plain covariance pooling used in image recognition is an orderless representation, which cannot model the spatio-temporal structure inherent in videos. Therefore, this paper proposes Temporal-attentive Covariance Pooling (TCP), inserted at the end of deep architectures, to produce powerful video representations.
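The contrast between GAP and plain covariance pooling can be made concrete with a short NumPy sketch. It shows only the two baseline poolings named in the abstract, not TCP itself; the function names and the toy tensor shape (frames, spatial positions, channels) are assumptions for illustration.

```python
import numpy as np

def global_average_pooling(features):
    """First-order pooling: (T, N, C) features -> (C,) channel means."""
    flat = features.reshape(-1, features.shape[-1])
    return flat.mean(axis=0)

def covariance_pooling(features):
    """Second-order pooling: (T, N, C) features -> (C, C) channel covariance."""
    flat = features.reshape(-1, features.shape[-1])
    centered = flat - flat.mean(axis=0, keepdims=True)
    return centered.T @ centered / flat.shape[0]

rng = np.random.default_rng(0)
clip = rng.standard_normal((8, 49, 16))   # 8 frames, 7x7 positions, 16 channels
shuffled = clip[rng.permutation(8)]       # same frames in a different order

gap = global_average_pooling(clip)        # shape (16,)
cov = covariance_pooling(clip)            # shape (16, 16)

# Both poolings ignore frame order: shuffling the frames changes nothing.
order_invariant = np.allclose(covariance_pooling(shuffled), cov)
```

Covariance pooling keeps pairwise channel statistics that GAP discards, but, as `order_invariant` indicates, it is unchanged when frames are shuffled. That is the "orderless" limitation the abstract says TCP is designed to address.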

covariance representation, representation, temporal-attentive covariance pooling network, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.98)
